refactor: improve readFileTool XML output format #2340

KJ7LNW · 2025-04-06T04:17:44Z

Context

This PR improves the XML output format in readFileTool.ts to make it more structured and less prone to misinterpretation by the model.

Implementation

Removed unnecessary XML indentation that could confuse the model
Separated file content from notices and errors using dedicated tags
Added line range information to content tags
Handled empty files properly with self-closing tags
Added comprehensive test coverage
Fixed carriage return handling in readLines()

Test Examples

Example 1: Definitions Only (maxReadFileLine=0)

<file><path>package.json</path><notice>Showing only 0 of 66 total lines. Use start_line and end_line if you need to read more</notice><list_code_definition_names>

# package.json
1--66 | {
20--45 |   "contributes": {
21--26 |     "commands": [
22--25 |       {
27--35 |     "viewsContainers": {
28--34 |       "activitybar": [
...
</list_code_definition_names></file>

Example 2: Content with Line Range

<file><path>package.json</path><content lines="1-66">
1 | {
 2 |   "name": "vsce-test-terminal-integration",
 3 |   "displayName": "Terminal Command Runner",
 4 |   "description": "Test VSCode terminal shell integration sequence race condition between parsing and AsyncIterable consumers",
 5 |   "version": "0.0.1",
 6 |   "engines": {
...
 </lst_code_definition_names></file>

Example 3: Empty File

<file><path>t</path><content/><notice>File is empty</notice></file>

Example 4: Error Handling

<file><path>src/nonexistent-file.ts</path><error>Error reading file: File not found: src/nonexistent-file.ts</error></file>
[read_file for 'src/nonexistent-file.ts'] Result:
The tool execution failed with the following error:
<error>
Error reading file: {"name":"Error","message":"File not found: nonexistent-file.ts",...}
</error>

How to Test

Run the existing tests
Try reading files with different line ranges and verify the XML output format

Get in Touch

Discord: KJ7LNW

Fixes #2278
cc: @hannesrudolph

Important

Refactor readFileTool.ts to improve XML output structure, add error handling, and enhance test coverage.

Behavior:
- Refactor XML output in readFileTool.ts to improve structure and clarity.
- Separate file content from notices and errors using dedicated tags.
- Add line range information to content tags.
- Handle empty files with self-closing tags.
- Add error handling for invalid paths, start_line, end_line, and RooIgnore errors.
Tests:
- Add comprehensive test coverage in read-file-maxReadFileLine.test.ts, read-file-xml.test.ts, and extract-text.test.ts.
- Test cases for different maxReadFileLine values, binary files, and range parameters.
- Validate XML structure, line range attributes, and error handling.
Functions:
- Modify addLineNumbers() in extract-text.ts to handle trailing newlines and empty content correctly.
- Update readLines() in read-lines.ts to process data chunks directly and handle edge cases.

^{This description was created by}^{for 1c7d1b24f260be734f7e4c9c227346375348d8f1. It will automatically update as commits are pushed.}

changeset-bot · 2025-04-06T04:17:47Z

⚠️ No Changeset found

Latest commit: 5413e47

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

hannesrudolph · 2025-04-06T05:47:11Z

@mrubens

Empty files should not have line numbers, but non-empty files with empty content at a specific line offset should. - If content is empty, return empty string for empty files - If content is empty but startLine > 1, return line number for empty content at that offset This ensures that the model does not think the file contains a single empty line. Signed-off-by: Eric Wheeler <[email protected]>

- Remove unnecessary XML indentation that could confuse the model - Separate file content from notices and errors using dedicated tags - Add line range information to content tags - Handle empty files properly with self-closing tags - Add comprehensive test coverage Fixes #2278 Signed-off-by: Eric Wheeler <[email protected]>

- Always display line numbers in non-range reads - Improve XML formatting with consistent newlines for better readability Signed-off-by: Eric Wheeler <[email protected]>

- Update test expectations to match the new XML format with newlines - Update tests to expect line numbers attribute in content tags - Modify test assertions to check for the correct line range values Signed-off-by: Eric Wheeler <[email protected]>

- Add newline to all output - Handle trailing newlines and empty lines consistently - Add test cases for blank lines: - Multiple blank lines within content - Multiple trailing blank lines - Only blank lines with offset - Trailing newlines Signed-off-by: Eric Wheeler <[email protected]>

- Modified extract-text mock to preserve actual addLineNumbers implementation - Removed mock implementation of addLineNumbers - Updated test data to account for trailing newline - Removed unnecessary mock verification Signed-off-by: Eric Wheeler <[email protected]>

- Replace direct mocking of addLineNumbers with spy on actual implementation - Add verification to ensure the real function is called when appropriate - Add skipAddLineNumbersCheck option for cases where function should not be called - Update test cases to use appropriate verification options - Fix numberedFileContent to include trailing newline for consistency Signed-off-by: Eric Wheeler <[email protected]>

KJ7LNW · 2025-04-08T22:29:23Z

This turned into a bit of a wild goose chase when I discovered that content is not always being read correctly particularly with multiple carriage returns at the end of a file. This has been corrected as part of this pull request in 8c9caa2

I believe this is ready for review and merge.

KJ7LNW · 2025-04-08T22:32:30Z

one more test to fix ...

src/core/__tests__/read-file-maxReadFileLine.test.ts

src/core/tools/readFileTool.ts

- Direct data processing provides more accurate results by preserving exact content with carriage returns - Improved performance through minimal buffering and efficient string operations - Use string indexes to find newlines while maintaining their original format - Handle all edge cases correctly with preserved line endings - Add tests for various edge cases including empty files, single lines, and different line endings Signed-off-by: Eric Wheeler <[email protected]>

mrubens · 2025-04-08T22:36:26Z

Great! Will look soon

Remove unused variable declaration to appease ellipsis-dev linter requirements. Signed-off-by: Eric Wheeler <[email protected]>

KJ7LNW · 2025-04-08T22:48:39Z

all tests are passing, this is ready for review

github-project-automation bot added this to Roo Code Roadmap Apr 6, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Apr 6, 2025

hannesrudolph moved this from New to PR [Pre Approval Review] in Roo Code Roadmap Apr 6, 2025

Eric Wheeler added 7 commits April 8, 2025 15:26

fix: always show line numbers in read_file XML output

6add315

- Always display line numbers in non-range reads - Improve XML formatting with consistent newlines for better readability Signed-off-by: Eric Wheeler <[email protected]>

KJ7LNW marked this pull request as ready for review April 8, 2025 22:29

KJ7LNW requested review from cte and mrubens as code owners April 8, 2025 22:29

dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Apr 8, 2025

ellipsis-dev bot reviewed Apr 8, 2025

View reviewed changes

src/core/__tests__/read-file-maxReadFileLine.test.ts Outdated Show resolved Hide resolved

src/core/tools/readFileTool.ts Outdated Show resolved Hide resolved

test: remove unused mockInputContent variable

5413e47

Remove unused variable declaration to appease ellipsis-dev linter requirements. Signed-off-by: Eric Wheeler <[email protected]>

mrubens approved these changes Apr 9, 2025

View reviewed changes

dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 9, 2025

mrubens merged commit 270fd88 into RooCodeInc:main Apr 9, 2025
12 checks passed

github-project-automation bot moved this from PR [Pre Approval Review] to Done in Roo Code Roadmap Apr 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor: improve readFileTool XML output format #2340

refactor: improve readFileTool XML output format #2340

Uh oh!

KJ7LNW commented Apr 6, 2025 •

edited

Loading

Uh oh!

changeset-bot bot commented Apr 6, 2025 •

edited

Loading

Uh oh!

hannesrudolph commented Apr 6, 2025

Uh oh!

KJ7LNW commented Apr 8, 2025 •

edited

Loading

Uh oh!

KJ7LNW commented Apr 8, 2025

Uh oh!

Uh oh!

Uh oh!

mrubens commented Apr 8, 2025

Uh oh!

KJ7LNW commented Apr 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

refactor: improve readFileTool XML output format #2340

refactor: improve readFileTool XML output format #2340

Uh oh!

Conversation

KJ7LNW commented Apr 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Context

Implementation

Test Examples

Example 1: Definitions Only (maxReadFileLine=0)

Example 2: Content with Line Range

Example 3: Empty File

Example 4: Error Handling

How to Test

Get in Touch

Uh oh!

changeset-bot bot commented Apr 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

hannesrudolph commented Apr 6, 2025

Uh oh!

KJ7LNW commented Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

KJ7LNW commented Apr 8, 2025

Uh oh!

Uh oh!

Uh oh!

mrubens commented Apr 8, 2025

Uh oh!

KJ7LNW commented Apr 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

KJ7LNW commented Apr 6, 2025 •

edited

Loading

changeset-bot bot commented Apr 6, 2025 •

edited

Loading

KJ7LNW commented Apr 8, 2025 •

edited

Loading